from IPython.display import Image
Image(filename='../imgs/banner.png')
%load_ext pretty_jupyter

Authors:

  • Bailey Passmore, Data Scientist, HRDAG
  • Larry Barrett, Consultant, HRDAG
import pendulum

Tuesday, 11 March 2025 at 07:26 PM (PDT)

%%html

<style>
    #Styling {
        font-weight: bold;
        font-family: Helvetica;
    }
</style>

Goal

  • What we have in dev-pre-restructure is working fine, but we're on a short deadline and need to streamline the data. Let's make a barebones table of the core data we need for analysis, including:
    • event_no
    • date_occurred (w/ year_occurred)
    • source_type (a field derived from the initial event type in the data, either 911 call reporting gunfire or Shotspotter GDT alert)
    • event_location (service address if specified)
    • date_dispatched (is this the same as date of arrival?)
    • area (the result of mapping the reported police district to the police area)

Research questions we're working towards

  1. Is dispatch reported at the same rate for all districts?
  2. RE: SoundThinking / Brookings Institution claim that some 80% of gunfire events do not get reported by citizens - is that true in Chicago?
    • When SST alerts aren't matched to 911 calls, what is the typical disposition of such an alert?
    • When 911 calls aren't matched to an SST alert, what is the typical disposition?
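
For question 1, one possible way to compare dispatch reporting across areas is to take the share of events with a non-missing date_dispatched in each group. This is only a sketch on toy records: the frame `toy` and the names `dispatched` and `dispatch_rate` are hypothetical, and treating a missing date_dispatched as "no dispatch reported" is an assumption rather than something CPD has confirmed.

```python
import pandas as pd

# Toy records (hypothetical) using the column names from the core table.
toy = pd.DataFrame({
    'event_no': [1, 2, 3, 4, 5, 6],
    'area': [1, 1, 1, 3, 3, 3],
    'date_dispatched': pd.to_datetime([
        '2021-03-31 00:25', None, '2021-04-02 13:10',
        None, None, '2021-05-01 09:00',
    ]),
})

# Assumption: a missing date_dispatched means no dispatch was reported.
dispatch_rate = (
    toy.assign(dispatched=toy['date_dispatched'].notna())
       .groupby('area')['dispatched']
       .mean()  # share of events per area with a reported dispatch
)
print(dispatch_rate)
```

The same grouping could be done by district rather than area if the district field is kept in the table.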

Time period covered

  • Earliest date occurred included: '2021-01-01'
  • Last date occurred included: '2024-09-23' (Per CPD, "ShotSpotter technology for the City of Chicago was discontinued in September 2024, meaning no records are available beyond that date.") The last date a ShotSpotter alert appears in the data is 23 September 2024, so we include records up to and including this date.
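
A minimal sketch of how the inclusive date window above could be applied, assuming a datetime date_occurred column. The toy frame `events` and the names `START`, `END`, and `in_window` are hypothetical and are not taken from the upstream merge task.

```python
import pandas as pd

# Bounds taken from the text above; both ends are meant to be inclusive.
START, END = '2021-01-01', '2024-09-23'

# Toy records (hypothetical) standing in for the real events table.
events = pd.DataFrame({
    'event_no': [1, 2, 3, 4],
    'date_occurred': pd.to_datetime([
        '2020-12-31 23:59:59',
        '2021-01-01 00:00:00',
        '2024-09-23 18:30:00',
        '2024-09-24 00:00:01',
    ]),
})

# Compare on the calendar date so events late on the last day are kept.
day = events['date_occurred'].dt.normalize()
in_window = events.loc[day.between(pd.Timestamp(START), pd.Timestamp(END))]
print(in_window['event_no'].tolist())
```

Comparing raw timestamps against '2024-09-23' would silently drop anything after midnight on the last day, which is why the sketch compares dates instead.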

Identifying Event type (based on the _inittype field)

shotspotter = ['SST', 'PSST', 'MSST'] # keywords provided by CPD in Info sheet
citizencalls = ['SHOTS', 'SHOTSF', 'PERSHO',] # Note: 'PERGUN', 'PERDOW','PERHLP', 'DOMBAT', etc. excluded
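
As a rough illustration of how these keyword lists could turn an initial event type into a source_type label, here is a sketch. The helper `classify_source` and the toy frame are hypothetical; the actual derivation happens upstream of this notebook and may differ.

```python
import pandas as pd

shotspotter = ['SST', 'PSST', 'MSST']         # keywords provided by CPD
citizencalls = ['SHOTS', 'SHOTSF', 'PERSHO']  # 911 calls reporting gunfire


def classify_source(init_type):
    """Hypothetical helper: map an initial event type code to a source type."""
    if init_type in shotspotter:
        return 'ShotSpotter alert'
    if init_type in citizencalls:
        return 'Human reporting gunfire'
    return None  # e.g. 'PERGUN' or 'DOMBAT', which are excluded above


toy = pd.DataFrame({'init_type': ['SST', 'SHOTSF', 'PERGUN', 'MSST']})
toy['source_type'] = toy['init_type'].map(classify_source)
print(toy)
```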

Note about what humans can do that SST can't

In the OEMC data, in addition to reports originating as calls about shots fired ('SHOTSF') or persons being shot ('PERSHO'), we also see events with an initial label 'PERGUN' and final label referring to shots fired or someone shot. This tells us that not only do Chicagoans report gunfire in general, they may also report early warning signs of conflict involving firearms before any shots occur, giving first responders a head start to arrive on scene and provide potentially life-saving care.
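
One way the early-warning pattern described above might be flagged is sketched below. It assumes that "final label referring to shots fired or someone shot" means a fin_type in the citizen-call keyword list. The set `GUNFIRE_FINAL`, the toy records, and the placeholder code 'OTHER' are all hypothetical; the early_warning column in the data may have been built differently.

```python
import pandas as pd

# Assumption: these final types count as "shots fired or someone shot".
GUNFIRE_FINAL = {'SHOTS', 'SHOTSF', 'PERSHO'}

# Toy records (hypothetical); 'OTHER' is a placeholder, not a real OEMC code.
toy = pd.DataFrame({
    'event_no': [1, 2, 3],
    'init_type': ['PERGUN', 'PERGUN', 'SHOTSF'],
    'fin_type': ['SHOTSF', 'OTHER', 'SHOTSF'],
})

# Early warning: the call started as a person-with-a-gun report and
# ended with a gunfire-related final type.
toy['early_warning'] = (
    (toy['init_type'] == 'PERGUN') & toy['fin_type'].isin(GUNFIRE_FINAL)
)
print(toy)
```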

# dependencies
import re
import numpy as np
from datetime import timedelta
import pandas as pd
from fuzzywuzzy import fuzz  # the scorers (fuzz.ratio, etc.) live in the fuzz submodule

from geopy.geocoders import Nominatim
from geopy.distance import geodesic
# support methods
def format_count(v):
    return "{:,}".format(v)


def format_prop(prop, decn=1, asperc=True):
    if asperc: prop = prop*100
    return "{}%".format(round(prop, decn))


def report_fields(df, idcol, cols, fillna=False, headn=10):
    # one row per unique combination of idcol and the reported fields
    data = df[[idcol] + cols].drop_duplicates()
    values = data[cols].fillna('None reported') if fillna else data[cols]
    # pandas >= 2.0 names these result columns 'count' and 'proportion'
    count = values.value_counts().to_frame().reset_index()
    perc = values.value_counts(normalize=True).to_frame().reset_index(
        ).rename(columns={'proportion': 'percent'})
    count['count'] = count['count'].apply(format_count)
    perc['percent'] = perc['percent'].apply(format_prop)
    out = pd.merge(count, perc, on=cols)
    return out.head(headn)


def report_overtime(df, idcol, yearcol):
    yearly = df[[idcol, yearcol]].groupby(yearcol).nunique().reset_index()
    yearly[yearcol] = yearly[yearcol].astype(str)
    yearly = yearly.sort_values(yearcol)
    return yearly
# main
colorder = [
    'event_no',
    'date_occurred',
    'date_dispatched',
    'area',
    'location',
    'location_x',
    'location_y',
    'init_type',
    'init_type_desc',
    'fin_type',
    'fin_type_desc',
    'disposition',
    'source_type',
    'early_warning',
]
data = pd.read_parquet("../../merge/output/events.parquet").rename(columns={
    'event_type': 'source_type',
    'event_type_init': 'init_type_desc',
    'event_type_fin': 'fin_type_desc',})[colorder]
assert data.event_no.nunique() == data.shape[0]
data['year_occurred'] = data.date_occurred.dt.year
sst = data.loc[data.source_type == 'ShotSpotter alert']
calls = data.loc[data.source_type == 'Human reporting gunfire']

Review data

Sample emergency event record.
94306
event_no         2109000161
date_occurred    2021-03-31 00:19:48
date_dispatched  NaT
area             5
location         32XX N LARAMIE AV /51XX W SCHOOL ST
location_x       41.940
location_y       -87.756
init_type        SHOTSF
init_type_desc   SHOTS FIRED
fin_type         SHOTSF
fin_type_desc    SHOTS FIRED
disposition      None
source_type      Human reporting gunfire
early_warning    False
year_occurred    2021

Counts

Overall

  • There are 385,448 gunfire-related 911 calls and ShotSpotter alerts prepared for this analysis.
  • The data cover a time period between 2021-01-01 and 2024-09-22.

Source types

  • Frequency table:
   source_type              count    percent
0  Human reporting gunfire  228,552  59.3%
1  ShotSpotter alert        156,896  40.7%
  • Summary: Of the 385,448 emergency events included in the analysis,
    • 228,552 or 59.3% were generated by a 911 call, and
    • 156,896 or 40.7% were generated by a ShotSpotter alert.

Initial Event types

Presented are:

  • the initial event type as reported by OEMC and CPD (init_type),
  • the description of the initial type as found in the data (init_type_desc), and
  • the type of source which reported the event (source_type).
   init_type  init_type_desc               source_type              count    percent
0  SHOTSF     SHOTS FIRED                  Human reporting gunfire  183,961  47.7%
1  SST        SHOT SPOTTER                 ShotSpotter alert        113,790  29.5%
2  MSST       Multiple Shot - ShotSpotter  ShotSpotter alert        38,855   10.1%
3  PERSHO     PERSON SHOT                  Human reporting gunfire  31,567   8.2%
4  SHOTS      SHOTS FIRED (OV)             Human reporting gunfire  13,024   3.4%
5  PSST       Probable Shot - ShotSpotter  ShotSpotter alert        4,251    1.1%

source_yearly = data[['event_no', 'year_occurred', 'source_type']
    ].groupby(['year_occurred', 'source_type']).nunique().reset_index()
src_ylr_piv = pd.pivot_table(
    source_yearly,
    values="event_no",
    index="year_occurred",
    columns="source_type",
    aggfunc="mean"
)
src_ylr_piv.plot(
    kind='bar',
    title='Emergency Events Observed Over Time',
    xlabel='Year Occurred', ylabel='Record count')
[Figure: bar chart "Emergency Events Observed Over Time"; x-axis: Year Occurred; y-axis: Record count; one series per source type]

Disposition

In the info page included with the data, the CPD FOIA officer informed us of two things. First, CPD had internally identified emergency events from both sources (911 callers and ShotSpotter) that referred to the same underlying gunfire event. Second, the disposition field was included in the responsive records only when ShotSpotter was the first to report. Source

  • Of the 385,448 emergency events included in the analysis, 123,215 or 32.0% have a reported disposition.

5 most frequently reported disposition values

Presented are the 5 most frequently reported `disposition` values for emergency events in which ShotSpotter was the first alert.
   disposition               count   percent
0  MISC.INC./OTH POLICE SER  87,414  70.9%
1  MISC.INC./NO PERSON FND.  15,671  12.7%
2  WEAP VIO/DISC OF FIREA    5,435   4.4%
3  BATTERY:AGGR:HANDGUN      2,746   2.2%
4  ASSAULT;AGG HAND          1,314   1.1%
  • Of the 123,215 emergency events about potential gunfire identified by CPD as first reported by ShotSpotter, 104,585 or 84.9% are labeled as a "Miscellaneous Incident."
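
A sketch of how a "Miscellaneous Incident" share like the one above could be computed, assuming the label is identified by the 'MISC.INC.' prefix of the disposition string. That prefix rule is an assumption, not a documented CPD definition, and the toy frame and the names `reported`, `is_misc`, and `misc_share` are hypothetical.

```python
import pandas as pd

# Toy records (hypothetical) reusing disposition strings shown in the table.
toy = pd.DataFrame({
    'event_no': [1, 2, 3, 4, 5],
    'disposition': [
        'MISC.INC./OTH POLICE SER',
        'MISC.INC./NO PERSON FND.',
        'WEAP VIO/DISC OF FIREA',
        None,
        'MISC.INC./OTH POLICE SER',
    ],
})

# Only events with a reported disposition enter the denominator.
reported = toy.dropna(subset=['disposition'])

# Assumption: any disposition beginning with 'MISC.INC.' is a
# "Miscellaneous Incident".
is_misc = reported['disposition'].str.startswith('MISC.INC.')
misc_share = is_misc.mean()
print(f"{is_misc.sum()} of {len(reported)} ({misc_share:.1%})")
```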

Police Areas

Image(filename="../imgs/2019-CPD-Area-Boundaries.webp")
[Image: map of 2019 CPD area boundaries]

Referring to reporting and CPD public data, we map each police district observed in the data to the corresponding police area.

  • Area 1, now called Area Central, will include the 2nd, 3rd, 7th, 8th, and 9th districts on the South Side.
  • Area 2, now called Area South, will include the 4th, 5th, 6th, and 22nd districts on the Far South Side.
  • Area 3, now called Area North, will include the 1st, 12th, 18th, 19th, 20th, and 24th districts on the North Side, largely along the lakefront.
  • Area 4 will include the 10th, 11th, and 15th districts on the West Side.
  • Area 5 will include the 14th, 16th, 17th, and 25th districts on the Northwest Side.
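
The district lists above can be written as a lookup table. The following is a sketch, assuming the district is stored as an integer. The names `AREA_DISTRICTS` and `DISTRICT_TO_AREA`, the `district` column, and the toy frame are hypothetical, and district 31 is included only to show what happens to an unmapped value.

```python
import pandas as pd

# Area -> districts, transcribed from the list above.
AREA_DISTRICTS = {
    1: [2, 3, 7, 8, 9],
    2: [4, 5, 6, 22],
    3: [1, 12, 18, 19, 20, 24],
    4: [10, 11, 15],
    5: [14, 16, 17, 25],
}

# Invert to district -> area for use with Series.map.
DISTRICT_TO_AREA = {
    district: area
    for area, districts in AREA_DISTRICTS.items()
    for district in districts
}

toy = pd.DataFrame({'district': [7, 22, 19, 11, 25, 31]})
# Districts missing from the lookup come back as NaN, so they are easy to audit.
toy['area'] = toy['district'].map(DISTRICT_TO_AREA)
print(toy)
```

Leaving unmapped districts as missing, rather than assigning a default area, keeps them visible in the area counts.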

Events by police area

  • The data refer to the police district associated with the call, but we map these to the reported corresponding police area to simplify analyses.
  • Presented are the emergency event counts by police area.
   area  count    percent
0  1     131,142  34.5%
1  2     104,839  27.6%
2  4     69,066   18.2%
3  5     39,722   10.4%
4  3     35,361   9.3%

Source type by area

  • Presented are the record counts by source type and police area.
   area  source_type              count   percent
0  1     Human reporting gunfire  68,615  18.1%
1  1     ShotSpotter alert        62,527  16.4%
2  2     Human reporting gunfire  53,770  14.1%
3  2     ShotSpotter alert        51,069  13.4%
4  3     Human reporting gunfire  34,925  9.2%
5  3     ShotSpotter alert        436     0.1%
6  4     Human reporting gunfire  37,357  9.8%
7  4     ShotSpotter alert        31,709  8.3%
8  5     Human reporting gunfire  28,663  7.5%
9  5     ShotSpotter alert        11,059  2.9%

Area 1

  • Presented are the record counts for Area 1, the area with the plurality of events, by source type and year occurred.
   source_type              year_occurred  count   percent
0  Human reporting gunfire  2021           20,928  16.0%
1  ShotSpotter alert        2021           16,985  13.0%
2  Human reporting gunfire  2022           19,325  14.7%
3  ShotSpotter alert        2022           15,265  11.6%
4  ShotSpotter alert        2023           18,641  14.2%
5  Human reporting gunfire  2023           17,089  13.0%
6  ShotSpotter alert        2024           11,636  8.9%
7  Human reporting gunfire  2024           11,273  8.6%